INSPECTIONS WITH UNKNOWN DETECTION PROBABILITIES
(The Proofreader Problem)

by

Cyrus Derman          Gerald J. Lieberman    Sheldon M. Ross

Columbia University   Stanford University    Univ. of Cal., Berkeley

1. Introduction and Summary

Suppose in an acceptance sampling situation the lot is subject to
100% inspection. The probability that a defective unit is detected is
different for each inspector and is unknown. It is of interest to
estimate N, the number of defective units in the lot (presumably, a
decision to reject or accept the lot would be based on the estimate of
N). Or, suppose satellites are used for surveillance over a given part of
the earth, with the detection of certain types of installations being the
mission of a given satellite. However, for various reasons, it can be
assumed that the detection of any existing installation is uncertain, with an
unknown probability of detection that varies among satellites. The
problem is to estimate the total number of installations based on the
number observed. A third situation involves the reading of a manuscript
by many proofreaders. Based on the results, it may be of interest to
estimate the total number of typographical errors.

For purposes of exposition we shall, in formulating the model, use
language suggested by the proofreader situation.

The proofreader problem has been treated by Polya [3] and Jewell [2].
In the context of wildlife recapture census, the literature reaches back
to the 1950's (a reference list appears in G.A.F. Seber [4]).

We develop models for estimating the quantities of interest. Our
models are generalizations of what has appeared in the wildlife recapture
census and proofreading literature. In the context of the proofreading
model the existing literature has considered the situation where all
$K$ ($K > 1$) readers read the entire manuscript. In our models we allow for
the possibility that the manuscript can be divided into several chapters.
Each reader reads one or more, but not necessarily all, chapters. We look
at this generalization in two ways. The first model is multivariate with
an unknown number of errors in each chapter to be estimated. The second
model assumes that the number of errors in the entire manuscript has a
Poisson distribution with unknown mean and that the relative sizes of the
chapters are known. We rely on the method of maximum likelihood for
estimating the unknown parameters. Typically the maximum likelihood
estimates of the quantities are solutions to equations which must be
solved numerically. In this paper we are not concerned with the
statistical properties of these estimates. We are primarily concerned
with the convergence properties and performance of an intuitive iterative
procedure which, given the present generation of personal computers, can
provide the desired numerical estimates in a matter of seconds.

2. General Model

We assume a "manuscript" with $M$, $M \ge 1$, "chapters" and $K$, $K > 1$,
"proofreaders". Each proofreader is assigned a number of chapters to
read. Let $K_i$ denote the set of proofreaders assigned to read chapter
$i$, $i = 1, \ldots, M$; let $L_j$ denote the set of chapters assigned to
proofreader $j$, $j = 1, \ldots, K$; let $N_i$ denote the unknown number of
"errors" in chapter $i$; let $p_j$ denote the unknown probability of
proofreader $j$ detecting a given error when he "reads" it. We assume
independence from error to error so that the numbers of errors that
proofreaders $j$, $j = 1, \ldots, K$, find are independent binomial random
variables with parameters $\sum_{i \in L_j} N_i$ and $p_j$, $j = 1, \ldots, K$. Let

\[ Q_i = \prod_{j \in K_i} (1 - p_j) \]

be the probability that a given error in chapter $i$ will not be found by
any proofreader. Let $n(j,i)$ denote the number of errors that
proofreader $j$ finds in chapter $i$; let $T_i$ denote the total number of
different errors found in chapter $i$ by all of the proofreaders assigned
to read that chapter. The likelihood function of the observed data given
$(N,p) = (N_1, \ldots, N_M, p_1, \ldots, p_K)$ is given by

\[ L(\text{data} \mid (N,p)) = \prod_{i=1}^{M} \frac{N_i!}{(N_i - T_i)!\, d_i!}\; Q_i^{N_i - T_i} \prod_{j \in K_i} p_j^{n(j,i)} (1 - p_j)^{T_i - n(j,i)} \]

\[ = \left[ \prod_{i=1}^{M} \frac{N_i!}{(N_i - T_i)!\, d_i!}\; Q_i^{N_i} \right] \prod_{j=1}^{K} \left( \frac{p_j}{1 - p_j} \right)^{n(j)}, \tag{1} \]

where $d_i$ is a function of the data associated with chapter $i$ and does
not depend on $(N,p)$, and where $n(j) = \sum_{i \in L_j} n(j,i)$ is the total number of
errors found by proofreader $j$.

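If we approximate by assuming the $N_i$'s to be continuous variables
and substitute $\log N_i$ for $\frac{d}{dN_i} \log N_i!$ (and likewise for $(N_i - T_i)!$) on
partial differentiation of $\log L$ with respect to the $N_i$'s and $p_j$'s, we
obtain equations that the maximum likelihood estimators $\hat{N}_i$, $\hat{p}_j$ of
$N_i$ and $p_j$ must satisfy:

\[ N_i = \frac{T_i}{1 - Q_i}, \quad i = 1, \ldots, M, \tag{2} \]

\[ p_j = \frac{n(j)}{\sum_{i \in L_j} N_i}, \quad j = 1, \ldots, K, \tag{3} \]

where

\[ Q_i = \prod_{j \in K_i} (1 - p_j). \]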
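The differentiation behind (2) and (3) can be written out as a check. The lines below are a sketch, assuming the continuous approximation is applied to $(N_i - T_i)!$ in the same way as to $N_i!$; the constant collects the terms in $d_i$, which do not depend on $(N,p)$.

```latex
\begin{align*}
\log L &= \sum_{i=1}^{M}\Bigl[\log N_i! - \log (N_i - T_i)! + N_i \log Q_i\Bigr]
          + \sum_{j=1}^{K} n(j)\,\log\frac{p_j}{1-p_j} + \text{const},\\
\frac{\partial \log L}{\partial N_i} &\approx \log N_i - \log (N_i - T_i) + \log Q_i = 0
  \;\Longrightarrow\; N_i Q_i = N_i - T_i
  \;\Longrightarrow\; N_i = \frac{T_i}{1 - Q_i},\\
\frac{\partial \log L}{\partial p_j} &= -\frac{\sum_{i \in L_j} N_i}{1 - p_j} + \frac{n(j)}{p_j (1 - p_j)} = 0
  \;\Longrightarrow\; p_j = \frac{n(j)}{\sum_{i \in L_j} N_i}.
\end{align*}
```

The second line uses $\log Q_i = \sum_{j \in K_i} \log(1 - p_j)$, so that $p_j$ enters $N_i \log Q_i$ only for the chapters $i \in L_j$.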


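Combining (2) and (3) yields the equations

\[ N_i = \frac{T_i}{1 - \prod_{j \in K_i} \left( 1 - \dfrac{n(j)}{\sum_{v \in L_j} N_v} \right)}, \quad i = 1, \ldots, M. \tag{4} \]

In addition to (4), we have the additional constraints that $N_i \ge T_i$,
$i = 1, \ldots, M$, which implies, also, that

\[ \sum_{v \in L_j} N_v \ge n(j), \quad j = 1, \ldots, K. \]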


For given values $\{N_i(0),\ i = 1, \ldots, M\}$ define, for $u \ge 0$,

\[ N_i(u+1) = \frac{T_i}{1 - \prod_{j \in K_i} \left( 1 - \dfrac{n(j)}{\sum_{v \in L_j} N_v(u)} \right)}, \quad i = 1, \ldots, M. \]

The above defined iteration is suggested by (2), where initially a value
for $N_i$ is given which in turn generates a value for $Q_i$ which in turn
generates another value for $N_i$, etc.
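The iteration can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' program: the names `step`, `iterate`, `readers_of` (the sets $K_i$) and `chapters_of` (the sets $L_j$) are assumptions, and the small data set is hypothetical. It assumes every chapter has at least one detected error, so that the denominator in (4) is positive.

```python
from math import prod

def step(N, T, n, readers_of, chapters_of):
    """One pass of the iteration: N(u) -> N(u+1)."""
    # Current estimate of each p_j from equation (3).
    p = [n[j] / sum(N[v] for v in chapters_of[j]) for j in range(len(n))]
    # Updated N_i from equation (2), with Q_i built from the current p_j.
    return [T[i] / (1.0 - prod(1.0 - p[j] for j in readers_of[i]))
            for i in range(len(T))]

def iterate(N0, T, n, readers_of, chapters_of, tol=1e-9, max_iter=1000):
    """Return the whole path N(0), N(1), ... until successive values agree."""
    path = [list(N0)]
    for _ in range(max_iter):
        path.append(step(path[-1], T, n, readers_of, chapters_of))
        if max(abs(a - b) for a, b in zip(path[-1], path[-2])) < tol:
            break
    return path

# Hypothetical data: 2 chapters, 3 readers (indices start at 0).
T = [40, 36]                     # distinct errors found per chapter
n = [30, 45, 28]                 # total errors found per reader
readers_of = [[0, 1], [1, 2]]    # K_i: readers assigned to chapter i
chapters_of = [[0], [0, 1], [1]] # L_j: chapters assigned to reader j

low = iterate(T, T, n, readers_of, chapters_of)                       # start at T_i
high = iterate([t + 100 for t in T], T, n, readers_of, chapters_of)   # start at T_i + 100
```

Comparing `low[-1]` with `high[-1]` shows whether the two starting points lead to the same limit for the data at hand; the sketch makes no claim that they always do.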

Proposition 1: If $N_i(0) = T_i$, $i = 1, \ldots, M$, then $\{N_i(u),\ u = 0, 1, \ldots\}$
is non-decreasing in $u$ for every $i$.

Proof: If $N_i(0) = T_i$, $i = 1, \ldots, M$, it is clear from (4) that $N_i(1) \ge
N_i(0)$, $i = 1, \ldots, M$. However, replacing $N_i(0)$ by $N_i(1)$ increases the
right side of (4), which means that $N_i(2) \ge N_i(1)$, $i = 1, \ldots, M$.
Continuing, the proposition follows.

The monotonicity of Proposition 1 does not guarantee that $\lim_{u \to \infty} N_i(u)$
exists; i.e., we could have $N_i(u) \to \infty$. Proposition 1 starts from the
lowest possible value of $N_i$. The following proposition starts from a
high value of $N_i$ and asserts monotonicity in the opposite direction.


Proposition 2: Suppose $N_i(0) = T_i N$. If

\[ \sum_{j \in K_i} \frac{n(j)}{\sum_{v \in L_j} T_v} > 1, \quad i = 1, \ldots, M, \tag{5} \]

then there exists $N$ large enough such that $\{N_i(u),\ u = 0, 1, \ldots\}$ is
non-increasing for every $i$. Consequently, $\{N_i(u),\ i = 1, \ldots, M;\
u = 0, 1, \ldots\}$ converges to a solution of (4).

Proof: For $N$ large enough, the left side of (5) (scaled by $1/N$) is the
dominating term, for each $i$, in the denominator of (4). For $N$ large enough one gets
that $N_i(1) \le N_i(0)$, $i = 1, \ldots, M$. By the same argument used in the
proof of Proposition 1, we get that $N_i(u+1) \le N_i(u)$, $i = 1, \ldots, M$.
Since the $N_i(u)$'s are bounded below by $T_i$ for every $i$, the sequences
must each have a limit. That the limit satisfies (4) follows by the
continuity of the functions involved in (4).

What still is an open question is whether, or under what conditions,
(4) has a unique solution in the region $N_i \ge T_i$. When uniqueness can be
established, then the limit arrived at in Proposition 2 can be taken
to be $\hat{N}_i$, $i = 1, \ldots, M$, at the same time yielding the values
$\hat{p}_j$, $j = 1, \ldots, K$.

Remark: There is a simple heuristic argument that also leads to the
estimators provided by (4). As

\[ N_i = T_i + \text{number of errors missed in chapter } i, \]

we obtain upon taking expectation that

\[ N_i = E(T_i) + N_i \prod_{j \in K_i} (1 - p_j). \]

Now given $N_i$, $i = 1, \ldots, M$, a natural estimate of $p_j$ is the number of
errors $j$ finds divided by the number of errors in the chapters read by
$j$, that is,

\[ p_j \approx \frac{n(j)}{\sum_{v \in L_j} N_v}. \]

Hence, we see that

\[ N_i \approx \frac{T_i}{1 - \prod_{j \in K_i} \left( 1 - \dfrac{n(j)}{\sum_{v \in L_j} N_v} \right)}, \quad i = 1, \ldots, M. \]


3. Special Cases

(a) $M = 1$, $K > 1$. This is the case where all $K$ proofreaders read
the entire manuscript. This is the case that has been treated in the wildlife
recapture census literature (see Seber [4]) and more recently by Polya [3]
for the case $K = 2$ and Jewell [2] for $K > 2$. Equation (4), with $N = N_1$,
becomes the single classical equation

\[ N = \frac{T}{1 - \prod_{j=1}^{K} \left( 1 - \dfrac{n(j)}{N} \right)}, \tag{6} \]

where $T = T_1$.

It is known (also see Corollary 1 to Proposition 4 below) that if
$\max_j \{n(j)\} < T < \sum_{j=1}^{K} n(j)$, then (6) has a unique root in the interval
$[T, \infty]$; if $\sum_{j=1}^{K} n(j) = T$ then $\hat{N} = \infty$, and if $T = \max_j \{n(j)\}$, then
$\hat{N} = T$. If $\sum_{j=1}^{K} n(j) > T$ then condition (5) holds and both Propositions
1 and 2 apply. Let $\hat{N}$ be the unique finite root to (6). Since the
right member of (6) is greater than the left member at $N = T$ (assuming
$\max_j \{n(j)\} < T$) and is less when $N$ is large enough (assuming
$\sum_j n(j) > T$), the curve defined by the right member crosses the line
defined by the left exactly once, at $N = \hat{N}$, from above. Thus, viewing
the iterative procedure graphically, we see that $\{N(u)\} \uparrow$ whenever
$N(0) < \hat{N}$ and $\{N(u)\} \downarrow$ whenever $N(0) > \hat{N}$. Since $\{N(u)\}$ would cease
to be monotone if it crossed $\hat{N}$, the increasing sequence as well as the
decreasing sequence must converge. The only point they can converge to is
$N = \hat{N}$.
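The graphical argument can be tried numerically. The sketch below is illustrative only: the function names `phi` and `solve_classical` and the data are assumptions. It iterates the right member of (6) from below and from above. For $K = 2$, clearing denominators in (6) gives the closed form $N = n(1)\,n(2) / (n(1) + n(2) - T)$, which serves as a check.

```python
from math import prod

def phi(N, T, n):
    """Right member of equation (6) evaluated at N."""
    return T / (1.0 - prod(1.0 - nj / N for nj in n))

def solve_classical(T, n, N0, tol=1e-10, max_iter=10_000):
    """Fixed-point iteration N(u+1) = phi(N(u)) started at N0."""
    N = float(N0)
    for _ in range(max_iter):
        nxt = phi(N, T, n)
        if abs(nxt - N) < tol:
            return nxt
        N = nxt
    return N

# Hypothetical data: two readers find 30 and 25 errors, 40 distinct in total.
T, n = 40, [30, 25]
from_below = solve_classical(T, n, N0=T)        # increasing sequence
from_above = solve_classical(T, n, N0=T + 100)  # decreasing sequence
closed_form = n[0] * n[1] / (n[0] + n[1] - T)   # K = 2 only
```

With these numbers the two readers share $30 + 25 - 40 = 15$ errors, so the closed form is $750 / 15 = 50$, and both iterations should settle at that value.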

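Suppose $\{n'(j)\}$ is such that $\max_j \{n'(j)\} < T < \sum_{j=1}^{K} n'(j)$. Let $N'$
be the root of (6) when $n'(j)$ replaces $n(j)$, $j = 1, \ldots, K$. We have

Proposition 3: If $\prod_{j=1}^{K} (1 - n'(j)/N) \ge \prod_{j=1}^{K} (1 - n(j)/N)$ for every $N \ge T$,
then $N' \ge \hat{N}$.

Proof: Let $N(0) = N'$. Then

\[ N(0) = N' = \frac{T}{1 - \prod_{j=1}^{K} (1 - n'(j)/N')} \ge \frac{T}{1 - \prod_{j=1}^{K} (1 - n(j)/N')} = N(1). \]

That is, $N(1) \le N(0)$, implying $N(u) \downarrow \hat{N}$; hence, $\hat{N} \le N'$.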

If it is assumed that $p_1 = p_2 = \cdots = p_K = p$, then the likelihood
function becomes, since $M = 1$,

\[ L(\text{data} \mid N, p) = \frac{N!}{(N - T)!\, d!}\; p^{\sum_{j=1}^{K} n(j)} (1 - p)^{KN - \sum_{j=1}^{K} n(j)}. \]

The equations for the maximum likelihood estimates are the same except
that

\[ p_j = p = \frac{\sum_{k=1}^{K} n(k)}{K N}. \]

Thus, under the assumption of equal $p_j$'s, equation (6) will be the same
with $n'(j)$ replacing $n(j)$, where $n'(j) = \frac{1}{K} \sum_{k=1}^{K} n(k)$. Now Proposition
3 applies since $\prod_{j=1}^{K} (1 - n'(j)/N) \ge \prod_{j=1}^{K} (1 - n(j)/N)$ for every $N \ge T$; this
follows from the concavity of $\log(1 - x)$ in the interval $0 \le x < 1$. Thus,
for the same data, the assumption of equal $p_j$'s always leads to an
estimate of $N$ that is at least as large.
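A small numerical comparison illustrates the ordering. The sketch is illustrative, with assumed names (`root_of_6`) and hypothetical counts; it solves (6) once with the observed $n(j)$ and once with each $n(j)$ replaced by the average, which corresponds to the equal-$p$ assumption.

```python
from math import prod

def root_of_6(T, n, tol=1e-10, max_iter=10_000):
    """Iterate equation (6) upward from N(0) = T until successive values agree."""
    N = float(T)
    for _ in range(max_iter):
        nxt = T / (1.0 - prod(1.0 - nj / N for nj in n))
        if abs(nxt - N) < tol:
            return nxt
        N = nxt
    return N

# Hypothetical data: three readers, 40 distinct errors found in total.
T = 40
n = [30, 25, 10]
n_equal = [sum(n) / len(n)] * len(n)   # n'(j): every reader gets the average count

N_unequal = root_of_6(T, n)        # readers allowed different p_j
N_equal = root_of_6(T, n_equal)    # all readers forced to share one p
```

Both data sets satisfy $\max_j n(j) < T < \sum_j n(j)$, so each call has a finite root to converge to, and `N_equal` should come out no smaller than `N_unequal`.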

Asymptotic variance and bias for the estimator $\hat{N}$ can be found in
Darroch [1].


(b) One chapter is read by all proofreaders, all other chapters are
read by only one proofreader. Here $K + 1 = M > 1$; $i = 0, \ldots, M-1$; all
proofreaders read chapter 0 and only proofreader $j$ reads chapter $j$,
$j = 1, \ldots, K$.

The equations (4) become

\[ N_0 = \frac{T_0}{1 - \prod_{j=1}^{K} \left( 1 - \dfrac{n(j,0) + T_j}{N_0 + N_j} \right)}, \qquad
   N_i = \frac{T_i (N_0 + N_i)}{n(i,0) + T_i}, \quad i = 1, \ldots, M-1. \tag{7} \]

The second part of (7) is equivalent to

\[ N_i = \frac{T_i}{n(i,0)}\, N_0, \quad i = 1, \ldots, M-1. \tag{8} \]

Substituting (8) in the first part of (7) yields

\[ N_0 = \frac{T_0}{1 - \prod_{j=1}^{K} \left( 1 - \dfrac{n(j,0) + T_j}{N_0 (1 + T_j / n(j,0))} \right)}
       = \frac{T_0}{1 - \prod_{j=1}^{K} \left( 1 - \dfrac{n(j,0)}{N_0} \right)}, \tag{9} \]

the classical equation discussed in special case (a) with $T = T_0$ and
$n(j) = n(j,0)$. Thus, (9) has a unique solution $\hat{N}_0$ which can be
obtained by iteration, and once $\hat{N}_0$ is obtained, $\hat{N}_i$, for $i = 1, \ldots, K$,
follows by (8).
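The two-stage computation for this case can be sketched as follows. The function name `solve_case_b`, its argument names and the data are illustrative assumptions; the sketch assumes every reader finds at least one error in the shared chapter, so that $n(j,0) > 0$ in (8).

```python
from math import prod

def solve_case_b(T0, n0, T_own, tol=1e-10, max_iter=10_000):
    """Solve (9) for N_0 by iteration, then apply (8).

    T0     distinct errors found in the shared chapter 0
    n0[j]  n(j,0): errors reader j found in chapter 0
    T_own  T_j: errors reader j found in the chapter only he reads
    """
    N0 = float(T0)
    for _ in range(max_iter):
        nxt = T0 / (1.0 - prod(1.0 - x / N0 for x in n0))
        converged = abs(nxt - N0) < tol
        N0 = nxt
        if converged:
            break
    # Equation (8): scale each private count by N_0 / n(j,0).
    return N0, [t * N0 / x for t, x in zip(T_own, n0)]

# Hypothetical data: two readers share chapter 0 and each reads one private chapter.
N0_hat, N_private = solve_case_b(T0=40, n0=[30, 25], T_own=[12, 10])
```

With these numbers (9) is the two-reader classical equation, whose root is $30 \cdot 25 / 15 = 50$, and (8) then gives $12 \cdot 50 / 30 = 20$ and $10 \cdot 50 / 25 = 20$ for the private chapters.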


4. Results of Simulations

In general when $M > 1$ the usefulness as an estimate of the $N_i$'s
of whatever limits result from use of the iterative procedure is in
question, since uniqueness in the region $N_i \ge T_i$, $i = 1, \ldots, M$, has not
been demonstrated. Neither have any results pertaining to the speed of
convergence been shown. To see what is likely to be the case, some
experiments were simulated for several cases. In each case convergence to
a unique and likely value of $N_i$ appears to occur, and the convergence
takes place in a matter of seconds when using a modern personal computer.
In each case we initiated two sets of calculations: one starting with
$N_i(0) = T_i$, the other with $N_i(0) = T_i + C$, where $C$ was large enough
to produce a decreasing sequence. In each case the calculations led
rapidly to the same values for $\hat{N}_i$, $i = 1, \ldots, M$. Specifically, we let
$M = K = 4$. We had two different chapter-assignment designs, $D_1$ and $D_2$:


where a 1 or 0 occurs in the entry of the design matrix for chapter $i$ and
reader $j$ according to whether or not reader $j$ is assigned to read
chapter $i$. We also had 5 different probability $\{p_j\}$ combinations for
each design:


                        Readers
Combinations      1      2      3      4
     1           .90    .90    .90    .90
     2           .10    .15    .75    .80
     3           .10    .15    .20    .25
     4           .10    .15    .20    .75
     5           .60    .70    .70    .80

In every case we set $N_i = 70$, $i = 1, \ldots, 4$, and $C = 100$. We take as
the estimates of $N_i$ the nearest integer to the limits of the iterative
procedure. The number of iterations required to reach the estimate was
taken to be the number of iterations until the nearest integer was
reached. In practice, more iterations are used in order to recognize when
the procedure appears to converge. However, the length of real time
required turns out to be negligible. The results of the experiments are
summarized in Table 1.

As would be expected, the accuracy of the estimates improves with
increasing $p_j$'s. This would be expected intuitively and from the formula
for the asymptotic variance of $\hat{N}$ given by Darroch [1] for the case of
$M = 1$. The number of iterations required also appears to decrease with
increasing $p_j$.
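The data for one such experiment can be generated along the following lines. This is a sketch under stated assumptions: the function name `simulate_counts` is hypothetical, and the design matrix `DESIGN` is made up for illustration because the two designs used in the experiments are not reproduced in this copy. The values $N_i = 70$ and $p_j = .90$ are those of combination 1.

```python
import random

def simulate_counts(N, p, design, seed=0):
    """Simulate one proofreading experiment.

    N[i]          true number of errors in chapter i
    p[j]          detection probability of reader j
    design[i][j]  1 if reader j is assigned chapter i, else 0
    Returns (T, n): distinct errors found per chapter, total finds per reader.
    """
    rng = random.Random(seed)
    K = len(p)
    T = []
    n = [0] * K
    for i, N_i in enumerate(N):
        distinct = 0
        for _ in range(N_i):
            # One independent Bernoulli trial per assigned reader for this error.
            hits = [design[i][j] == 1 and rng.random() < p[j] for j in range(K)]
            for j in range(K):
                n[j] += int(hits[j])
            distinct += int(any(hits))
        T.append(distinct)
    return T, n

# Hypothetical design (not one of the paper's two designs):
# rows are chapters, columns are readers, each reader reads two chapters.
DESIGN = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]

T, n = simulate_counts([70] * 4, [0.90] * 4, DESIGN, seed=1)
```

The resulting `T` and `n` are the sufficient statistics that the iterative procedure takes as input.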


5. Poisson Model

Assume that the ratio of the size of chapter $i$ to the whole
manuscript is $\alpha_i$, $\alpha_i > 0$, $\sum_{i=1}^{M} \alpha_i = 1$. Assume the numbers of errors
$(N_1, \ldots, N_M)$ are independent random variables with Poisson
distributions having means $\alpha_i \lambda$, $i = 1, \ldots, M$, where $\lambda$, as opposed to
the $\alpha_i$, is unknown. Under this assumption, following Jewell [2], the
likelihood function, averaging (1) over the possible values of
$\{N_1, \ldots, N_M\}$, becomes

\[ L(\text{data} \mid (\lambda, p)) = D \prod_{i=1}^{M} (Q_i \alpha_i \lambda)^{T_i}\; e^{-\lambda \sum_{i=1}^{M} \alpha_i (1 - Q_i)} \prod_{j=1}^{K} \left( \frac{p_j}{1 - p_j} \right)^{n(j)}, \tag{10} \]

where $D$ is a function of the data. Taking partial derivatives of
$\log L$, we see that the maximum likelihood estimates $\hat{\lambda}$, $\hat{p}_j$ of $\lambda$ and
$p_j$ must satisfy


Table 1

                                               Chapters
                                     1          2          3          4
Combination                        D1   D2    D1   D2    D1   D2    D1   D2

1   Estimate of N_i                70   71    70   70    70   71    70   71
    No. of iters. from T_i          1    2     1    1     1    1     1    1
    No. of iters. from T_i + 100    3    4     3    6     3    5     3    5

2   Estimate of N_i                67   74    73   64    71   78    72   72
    No. of iters. from T_i          4   12     4   10     3   10     3    9
    No. of iters. from T_i + 100   10   15     9   19    10   17     9   14

3   Estimate of N_i               116   33    86   34    88   66    87   46
    No. of iters. from T_i         31   20    38   22    35   25    35   26
    No. of iters. from T_i + 100  >40  >40    32  >40    34  >40    33   37

4   Estimate of N_i                80   64    64   80    73   83    62   70
    No. of iters. from T_i         19   25    11   39    12   26    21   21
    No. of iters. from T_i + 100   19   37    26   37    22  >40    18  >40

5   Estimate of N_i                71   69    71   75    69   74    71   72
    No. of iters. from T_i          1    3     2    7     2    4     1    5
    No. of iters. from T_i + 100    5   12     4    8     4    8     5    7

                                         Estimates of p_j's
                                               Readers
                                     1          2          3          4
Combination                        D1   D2    D1   D2    D1   D2    D1   D2

1                                 .95  .92   .90  .92   .90  .88   .90  .91
2                                 .09  .09   .13  .13   .73  .65   .81  .81
3                                 .07  .16   .10  .19   .12  .23   .20  .38
4                                 .09  .11   .13  .13   .19  .17   .76  .79
5                                 .64  .51   .70  .65   .73  .77   .76  .78


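\[ \lambda = \frac{\sum_{i=1}^{M} T_i}{\sum_{i=1}^{M} \alpha_i (1 - Q_i)}, \qquad
   p_j = \frac{n(j)}{\lambda - \sum_{v \notin L_j} T_v}, \quad j = 1, \ldots, K, \tag{11} \]

leading to the equation for $\lambda$ being a solution of the equation

\[ \lambda = \frac{\sum_{i=1}^{M} T_i}{1 - \sum_{i=1}^{M} \alpha_i \prod_{j \in K_i} \left( 1 - \dfrac{n(j)}{\lambda - \sum_{v \notin L_j} T_v} \right)}. \tag{12} \]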


Thus, where the model in Section 2 leads to $M$ equations in $M$
variables, the Poisson model reduces the number of equations to one
equation with one variable. For case (a) in Section 3, (12) reduces to (6),
as was pointed out by Jewell [2]. The same numerical iteration suggests
itself. If $\lambda(0)$, the initial value of the iteration, is $\sum_{i=1}^{M} T_i$, one has
the same monotonicity as in Proposition 1. Numerical calculations indicate
that the sequence $\{\lambda(u)\}$ will converge. The question of uniqueness of
the solution to (12) in the region $\lambda \ge \sum_{i=1}^{M} T_i$ is open. (Since
$n(j) \le \sum_{v \in L_j} T_v$, $\lambda \ge \sum_{i=1}^{M} T_i$ implies $n(j) \le \lambda - \sum_{v \notin L_j} T_v$.) The next
proposition provides a partial answer to this question.

Let

\[ c_j = \sum_{v \notin L_j} T_v, \quad j = 1, \ldots, K, \]

and

\[ T = \sum_{v=1}^{M} T_v. \]


Proposition 4: If for every $i$, $i = 1, \ldots, M$, and every $j \in K_i$,

\[ \sum_{k \ne j,\; k \in K_i} n(k) \ge \frac{2 c_j}{1 - c_j / T}, \tag{13} \]

then there is at most one solution to (12) in $(T, \infty)$.


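Proof: Invert both sides of (12), letting $z = 1/\lambda$, to get

\[ z = G(z)/T, \]

where

\[ G(z) = 1 - \sum_{i=1}^{M} \alpha_i \prod_{j \in K_i} \left( 1 - \frac{n(j) z}{1 - c_j z} \right). \]

We shall show that (13) is a sufficient condition for $G(z)$ to be a
concave function in the interval $(0, 1/T)$. To this end, the first
derivative with respect to $z$ is

\[ G'(z) = -\sum_{i=1}^{M} \alpha_i H_i'(z), \]

where

\[ H_i(z) = \prod_{j \in K_i} \left( 1 - \frac{n(j) z}{1 - c_j z} \right), \quad i = 1, \ldots, M. \]

But

\[ H_i'(z) = -\sum_{j \in K_i} P_j(z), \]

where

\[ P_j(z) = \prod_{k \ne j,\; k \in K_i} \left( 1 - \frac{n(k) z}{1 - c_k z} \right) \frac{n(j)}{(1 - c_j z)^2}, \quad j \in K_i. \]

Thus,

\[ H_i''(z) = -\sum_{j \in K_i} P_j'(z), \]

but

\[ P_j'(z) = \frac{2 c_j n(j)}{(1 - c_j z)^3} \prod_{k \ne j} \left( 1 - \frac{n(k) z}{1 - c_k z} \right)
   - \frac{n(j)}{(1 - c_j z)^2} \sum_{k \ne j} \frac{n(k)}{(1 - c_k z)^2} \prod_{v \ne k, j} \left( 1 - \frac{n(v) z}{1 - c_v z} \right) \]

\[ = \frac{n(j)}{(1 - c_j z)^2} \prod_{v \ne j} \left( 1 - \frac{n(v) z}{1 - c_v z} \right)
   \left\{ \frac{2 c_j}{1 - c_j z} - \sum_{k \ne j} \frac{n(k)}{(1 - (c_k + n(k)) z)(1 - c_k z)} \right\}, \]

with all products and sums over indices in $K_i$. Thus, the sign of $P_j'(z)$
is the sign of the expression in braces, which is less than or equal to
$\frac{2 c_j}{1 - c_j / T} - \sum_{k \ne j} n(k)$. Then, by the assumption (13), the above is
non-positive in $(0, 1/T)$ for every $j \in K_i$. Therefore, $H_i''(z) \ge 0$ in
$[0, 1/T)$ for every $i$, and consequently,

\[ G''(z) \le 0, \quad 0 \le z < 1/T, \]

as was to be shown. Since $G(0) = 0$, a concave $G$ can meet the line $Tz$ at
most once in $(0, 1/T)$, barring the degenerate case in which the two
coincide on an interval.

In the case where every proofreader reads every chapter, $c_j = 0$,
$j = 1, \ldots, K$ (or we can consider $M = 1$, and (12) becomes (6)), and we
have the already known result.

Corollary 1: If every proofreader reads every chapter, then (12) has at
most one solution in the interval $\lambda \ge T$.

Proof: Condition (13) is satisfied in this case.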
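Condition (13) involves only the observed counts, so it can be checked directly before iterating. The sketch below is illustrative: the function name `condition_13` and the argument names `readers_of` ($K_i$) and `chapters_of` ($L_j$) are assumptions, and both data sets are hypothetical. It assumes every reader is assigned at least one chapter with a detected error, so that $c_j < T$.

```python
def condition_13(T, n, readers_of, chapters_of):
    """Return True if (13) holds for every chapter i and every reader j in K_i."""
    total = sum(T)
    # c_j: distinct errors found in the chapters reader j does NOT read.
    c = [sum(T[v] for v in range(len(T)) if v not in chapters_of[j])
         for j in range(len(n))]
    for i in range(len(T)):
        for j in readers_of[i]:
            others = sum(n[k] for k in readers_of[i] if k != j)
            if others < 2.0 * c[j] / (1.0 - c[j] / total):
                return False
    return True

# Every reader reads the single chapter: c_j = 0, so (13) holds (Corollary 1).
full = condition_13([40], [30, 25], readers_of=[[0, 1]], chapters_of=[[0], [0]])

# Partial reading (hypothetical): reader 0 skips chapter 1, so c_0 = 36 and
# (13) would need n(1) = 45 >= 72 / (1 - 36/76), which fails.
partial = condition_13([40, 36], [30, 45, 28],
                       readers_of=[[0, 1], [1, 2]],
                       chapters_of=[[0], [0, 1], [1]])
```

A `False` result does not mean that (12) has several roots; (13) is only a sufficient condition.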

When $G(z)$ is concave, equation (12) will have exactly one solution
in the interval $\lambda > T$ if $G'(0)/T > 1$. Since

\[ G'(0) = -\sum_{i=1}^{M} \alpha_i H_i'(0) = \sum_{i=1}^{M} \alpha_i \sum_{j \in K_i} P_j(0) = \sum_{i=1}^{M} \alpha_i \sum_{j \in K_i} n(j), \]

we have

Corollary 2: If (13) holds and

\[ \sum_{i=1}^{M} \alpha_i \sum_{j \in K_i} n(j) > T, \]

then there exists exactly one solution to (12) in the interval $\lambda > T$.
Even if (12) does not have a unique solution we can bound all the
roots of (12). Let $\bar{\lambda}$ be the unique solution to (12) in $[T, \infty)$ when
$c_j = 0$, $j = 1, \ldots, K$. We have

Corollary 3: Let $\lambda$ be any solution to (12) in $[T, \infty)$; then $\lambda \le \bar{\lambda}$.

Proof: This corollary follows from the fact that the right side of (12)
is continuous and that it is always largest when $c_j = 0$, $j = 1, \ldots, K$.

The previous proposition and corollaries deal with sufficient
conditions for a unique solution to (12) to exist and an upper bound $\bar{\lambda}$
on the possible solutions in case there may be more than one solution.
The following proposition indicates that the iterative method for solving
(12) indicates when a unique solution exists or at worst provides bounds
within which the appropriate solution to (12) exists. Let $\lambda_1 \le \lambda_2
\le \cdots \le \lambda_r$ be the roots of (12) in $(T, \infty)$.

Proposition 5: If $\lambda(0) = T$, then $\lim_{u \to \infty} \lambda(u) = \lambda_1$; if $\lambda(0) = \bar{\lambda}$, then
$\lim_{u \to \infty} \lambda(u) = \lambda_r$. Consequently, if the two limits are equal, then (12) has a
unique solution in $(T, \infty)$; otherwise $T \le \lambda_1 \le \hat{\lambda} \le \lambda_r \le \bar{\lambda}$.

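Proof: Let $\phi(\lambda)$ denote the right member of (12). We shall exploit the
fact that $\phi(\lambda)$ is increasing in $\lambda$ and that if $\lambda(0) = T$, $\{\lambda(u)\}$ is
increasing in $u$. Suppose $\lim_{u \to \infty} \lambda(u) = \lambda^* > \lambda_1$. Then there exists a $u$
such that $\lambda(u) < \lambda_1$ and $\lambda(u+1) > \lambda_1$. That is, we have

\[ \phi(\lambda(u)) = \lambda(u+1) > \lambda_1 = \phi(\lambda_1), \]

a contradiction of the fact that $\phi(\lambda(u)) \le \phi(\lambda_1)$. The proof is analogous
for $\lambda(0) = \bar{\lambda}$.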
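The bracketing described in Proposition 5 can be sketched numerically. Everything below is an illustrative assumption rather than the authors' program: the function names `phi` and `fixed_point`, the data, and the equal chapter weights `alpha`. The bound $\bar{\lambda}$ is obtained by solving (12) with every $c_j$ set to zero, and (12) itself is then iterated from $T$ and from $\bar{\lambda}$.

```python
from math import prod

def phi(lam, T, n, alpha, readers_of, c):
    """Right member of equation (12) evaluated at lam."""
    s = sum(alpha[i] * prod(1.0 - n[j] / (lam - c[j]) for j in readers_of[i])
            for i in range(len(T)))
    return sum(T) / (1.0 - s)

def fixed_point(f, x0, tol=1e-10, max_iter=10_000):
    """Iterate x -> f(x) until successive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        nxt = f(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x

# Hypothetical data: readers 0 and 1 read both chapters, reader 2 only chapter 0.
T = [40, 36]
n = [55, 50, 30]
alpha = [0.5, 0.5]
readers_of = [[0, 1, 2], [0, 1]]
chapters_of = [[0, 1], [0, 1], [0]]
c = [sum(T[v] for v in range(len(T)) if v not in chapters_of[j])
     for j in range(len(n))]
zero = [0] * len(n)
total = float(sum(T))

lam_bar = fixed_point(lambda x: phi(x, T, n, alpha, readers_of, zero), total)
lam_low = fixed_point(lambda x: phi(x, T, n, alpha, readers_of, c), total)     # from T
lam_high = fixed_point(lambda x: phi(x, T, n, alpha, readers_of, c), lam_bar)  # from the bound
```

If `lam_low` and `lam_high` agree to within the tolerance, Proposition 5 indicates a unique root in $(T, \infty)$ for these data; otherwise the two values bracket the roots. The data are chosen so that $\sum_i \alpha_i \sum_{j \in K_i} n(j) = 120 > T = 76$, which keeps $\bar{\lambda}$ finite.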
6. Remarks

In the cases where we have not shown convergence of the iterative
procedure, numerical examples have shown convergence quite often and in a
matter of seconds on a personal computer. However, if a personal computer
is not available or convergence may not occur, a two-step iterative
procedure suggests itself. In equations (2) the estimate of $N_i$ would be
immediate if we had an estimate of $Q_i$. If we go beyond the sufficient
statistics $\{T_i, n(j)\}$, such estimates are available. Let $S_j$ denote the
number of distinct errors found by all proofreaders other than proofreader
$j$ in the chapters read by proofreader $j$. Within this set of distinct
errors, let $s_j$ denote the number of errors found by proofreader $j$.
Then an estimate $\tilde{p}_j$ of $p_j$ is given by $s_j / S_j$. Inserting $\tilde{p}_j$ for
the $p_j$'s in $Q_i$ yields an estimate $\tilde{Q}_i$ of $Q_i$, from which an estimate
$\tilde{N}_i$ of $N_i$ follows from (2). The procedure could terminate at this point or,
perhaps, be carried on one more iteration by setting $N_i(0) = \tilde{N}_i$ in (4)
and then letting $N_i(1)$ be the final estimate $\hat{N}_i$.
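The first step of this two-step procedure needs the identities of the errors found, not just the counts. A minimal sketch, assuming each reader's findings are recorded as sets of error labels per chapter (the function name `two_step` and the data layout `found[j][i]` are illustrative assumptions), and assuming every reader shares at least one chapter with a reader who found something, so that $S_j > 0$:

```python
from math import prod

def two_step(found, chapters_of, readers_of):
    """found[j][i] is the set of error labels reader j found in chapter i."""
    K = len(chapters_of)
    p_tilde = []
    for j in range(K):
        s = S = 0
        for i in chapters_of[j]:
            # Distinct errors found in chapter i by readers other than j.
            others = set().union(*(found[k][i] for k in readers_of[i] if k != j))
            S += len(others)
            s += len(others & found[j][i])   # ...of which reader j also found these
        p_tilde.append(s / S)
    N_tilde = []
    for i in range(len(readers_of)):
        T_i = len(set().union(*(found[j][i] for j in readers_of[i])))
        Q_i = prod(1.0 - p_tilde[j] for j in readers_of[i])
        N_tilde.append(T_i / (1.0 - Q_i))    # equation (2) with the estimated Q_i
    return p_tilde, N_tilde

# Hypothetical data: one chapter, two readers, errors labelled 1..40.
found = [
    {0: set(range(1, 31))},    # reader 0 found errors 1..30
    {0: set(range(16, 41))},   # reader 1 found errors 16..40
]
p_tilde, N_tilde = two_step(found, chapters_of=[[0], [0]], readers_of=[[0, 1]])
```

Here reader 0 found 15 of the 25 errors found by reader 1, and reader 1 found 15 of the 30 found by reader 0, so the estimates are $.6$ and $.5$, $\tilde{Q}_1 = .2$, and $\tilde{N}_1 = 40 / .8 = 50$.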

A model not covered in the paper would be of interest. Errors may
fall into different categories where the $p_j$'s for each reader would vary
in an unknown way with each category. If the categories are recognizable,
then the present model can be adapted by treating each category separately.
However, if the categories are not recognizable this device will not work.